A track finding and fitting algorithm for the ALICE Time Projection Chamber (TPC), based on
Kalman filtering, is presented. The implementation of particle identification (PID) using the
dE/dx measurement is discussed. The filtering and PID algorithms are able to cope with
non-Gaussian noise as well as with ambiguous measurements in a high-density environment. The
occupancy can reach up to 40% and, due to the overlaps, points along the track are often lost
and others are significantly displaced. In the present algorithm, clusters are found first and
the space points are reconstructed. The shape of a cluster provides information about the
overlap factor. A fast spline unfolding algorithm is applied to points with distorted shapes.
Then, the expected space point error is estimated using information about the cluster shape and
the track parameters. Furthermore, available information about local track overlap is used.
Tests are performed on simulated data sets to validate the analysis and to gain practical
experience with the algorithm.

1. Introduction

Track finding for the predicted particle densities is one of the most challenging tasks in the
ALICE experiment [1]. It is still under development, and the current status is reported here.
Track finding is based on the Kalman-filtering approach. Kalman-like algorithms are widely used
in high-energy physics experiments and their advantages and shortcomings are well known.

There are two main disadvantages of the Kalman filter which affect the tracking in the ALICE
TPC.

The first is that clusters have to be reconstructed before the Kalman-filter procedure can be
applied. Occupancies of up to 40% in the inner sectors of the TPC and up to 20% in the outer
sectors are expected. Clusters from different tracks may overlap; therefore a certain number of
the clusters are lost, and the others may be significantly displaced. These displacements are
rather hard to take into account. Moreover, they are strongly correlated, depending on the
distance between the two tracks.

The other disadvantage of Kalman-filter tracking is that it relies essentially on the
determination of good 'seeds' to start a stable filtering procedure. Unfortunately, for the
tracking in the ALICE TPC the seeds have to be constructed using the TPC data themselves. The
TPC is a key starting point for the tracking in the entire ALICE set-up. Until now, practically
none of the other detectors have been able to provide the initial information about tracks.

On the other hand, there is a whole list of very 
attractive properties of the Kalman-filter approach. 

• It is a method for simultaneous track recognition 
and fitting. 

• It is possible to reject incorrect space points 'on the fly', during a single tracking
pass. Such incorrect points can appear as a consequence of the imperfection of the cluster
finder. They may be due to noise, or they may be points from other tracks accidentally
captured in the list of points to be associated with the track under consideration. In
other tracking methods one usually needs an additional fitting pass to get rid of
incorrectly assigned points.

• In the case of substantial multiple scattering, 
track measurements are correlated and therefore 
large matrices (of the size of the number of mea- 
sured points) need to be inverted during a global 
fit. In the Kalman-filter procedure we only have 
to manipulate up to 5 x 5 matrices (although 
many times, equal to the number of measured 
points), which is much faster. 

• Using this approach one can handle multiple 
scattering and energy losses in a simpler way 
than in the case of global methods. 

• Kalman filtering is a natural way to find the 
extrapolation of a track from one detector to 
another (for example from the TPC to the ITS 
or to the TRD). 

The following parametrization for the track was chosen:

$$ y(x) = y_0 - \frac{1}{C}\sqrt{1 - (Cx - \eta)^2} \qquad (1) $$

$$ z(x) = z_0 - \frac{\tan\lambda}{C}\arcsin(Cx - \eta) \qquad (2) $$

The state vector $x^T$ is given by the local track position x, y and z, by the curvature C, by
the local $x_0$ position of the helix center, and by the dip angle $\lambda$:

$$ x^T = (y, z, C, \tan\lambda, \eta), \qquad \eta \equiv C x_0 \qquad (3) $$
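As a small illustration of the parametrization in equations (1) and (2), the sketch below evaluates y(x) and z(x) for a given set of track parameters. The function name, the argument names and the sign conventions (both square-root and arcsin terms subtracted) are assumptions made for this example, and the curvature is assumed to be non-zero.

```python
import math


def track_position(x, y0, z0, curvature, tan_lambda, eta):
    """Evaluate (y, z) at local x for the helix parametrization.

    Assumed conventions (not taken from any particular code base):
      y(x) = y0 - sqrt(1 - (C x - eta)^2) / C
      z(x) = z0 - tan(lambda) / C * asin(C x - eta)
    """
    arg = curvature * x - eta
    if abs(arg) > 1.0:
        # The helix does not reach this x position.
        raise ValueError("x is outside the range reachable by this helix")
    y = y0 - math.sqrt(1.0 - arg * arg) / curvature
    z = z0 - tan_lambda / curvature * math.asin(arg)
    return y, z
```

At the x position of the helix center (C x = η) the arcsin term vanishes, so z equals $z_0$ there.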



Because of the high occupancy the standard Kalman-filter approach was modified. We tried to
find the maximum possible additional information which can be used during cluster finding,
tracking and particle identification. Because of too many degrees of freedom (up to 220 million
10-bit samples) we have to find a smaller number of orthogonal parameters.

Figure 1: Schematic view of the detection process in the TPC (upper part: perspective view,
lower part: side view).

To enable the use of the optimal combination of local and global information about the tracks
and clusters, the parallel Kalman-filter tracking method was proposed. Several hypotheses are
investigated in parallel. A global tracking approach such as the Hough transform was considered
only for the seeding of track candidates. In the following, the additional information which
was used will be underlined.



2. Accuracy of local coordinate measurement

The accuracy of the coordinate measurement is limited by the track angle, which spreads the
ionization, and by diffusion, which amplifies this spread.

The track direction with respect to the pad plane is given by two angles $\alpha$ and $\beta$
(see fig. 1). For the measurement along the pad-row, the angle $\alpha$ between the track
projected onto the pad plane and the pad-row is relevant. For the measurement of the drift
coordinate (z-direction) it is the angle $\beta$ between the track and the z axis (fig. 1).

The ionization electrons are randomly distributed along the particle trajectory. Fixing the
reference x position of an electron at the middle of the pad-row, the y (resp. z) position of
the electron is a random variable characterized by a uniform distribution with the width
$L_a$, where $L_a$ is given by the pad length $L_{pad}$ and the angle $\alpha$ (resp.
$\beta$):

$$ L_a = L_{pad} \tan\alpha $$

The diffusion smears out the position of the electron with a Gaussian probability distribution
of width $\sigma_D$. The contributions of the E×B and unisochronity effects are negligible for
the ALICE TPC. The typical resolution in the case of the ALICE TPC is on the level of
$\sigma_y \approx 0.8$ mm and $\sigma_z \approx 1.0$ mm, integrating over all clusters in the
TPC.

2.1. Gas gain fluctuation effect

After being collected on the sense wire, an electron is "multiplied" in the strong electric
field. This multiplication is subject to large fluctuations, which contribute to the cluster
position resolution. Because of these fluctuations the center of gravity of the electron cloud
can be shifted.

Each electron is amplified independently. However, in the reconstruction electrons are not
treated separately. The Centre Of Gravity (COG) of the cluster is usually used as an estimate
of the local track position. The influence of the gas gain fluctuation on the characteristics
of the reconstructed point can be described by a simple model, introducing a weighted COG
$x_{COG}$:

$$ x_{COG} = \frac{\sum_{i=1}^{N} g_i x_i}{\sum_{i=1}^{N} g_i} \qquad (4) $$

where N is the total number of electrons in the cluster and $g_i$ is a random variable equal to
the gas amplification for a given electron.

The mean value $\bar{x}_{COG}$ is equal to the mean value $\bar{x}$ of the original
distribution of electrons:

$$ \bar{x}_{COG} = \overline{\left(\frac{\sum_{i=1}^{N} g_i x_i}{\sum_{i=1}^{N} g_i}\right)} = \bar{x}\,\overline{\left(\frac{\sum_{i=1}^{N} g_i}{\sum_{i=1}^{N} g_i}\right)} = \bar{x} \qquad (5) $$



However, the same is not true for the dispersion of the position,

$$ \sigma^2_{x_{COG}} = \overline{x^2_{COG}} - \bar{x}^2_{COG} = \overline{\left(\frac{\sum_{i=1}^{N} g_i x_i}{\sum_{i=1}^{N} g_i}\right)^2} - \bar{x}^2 = \frac{\sigma_x^2}{N}\,G_{gfactor} \qquad (6) $$

where

$$ G_{gfactor} = N\,\overline{\left(\frac{\sum_i g_i^2}{\sum_i \sum_j g_i g_j}\right)} \qquad (7) $$

The diffusion term is effectively multiplied by the gas gain factor $G_{gfactor}$. For a
sufficiently large number of electrons, when $\sum g_i^2$ and $\sum\sum g_i g_j$ are
quasi-independent variables, equation (7) can be transformed to the following:

$$ G_{gfactor} \approx N\,\frac{\overline{\sum_i g_i^2}}{\overline{\sum_i \sum_j g_i g_j}} = N\,\frac{N\,\overline{g^2}}{N(N-1)\,\bar{g}^2 + N\,\overline{g^2}} = \frac{N\,(\sigma_g^2/\bar{g}^2 + 1)}{N + \sigma_g^2/\bar{g}^2} \qquad (8) $$

The gas gain fluctuation of a gas detector working in the proportional regime is described by
the exponential distribution with the mean value $\bar{g}$ and r.m.s.

$$ \sigma_g = \bar{g} \qquad (9) $$

Substituting $\sigma_g$ into equation (8) gives

$$ G_{gfactor} = \frac{2N}{N+1} \qquad (10) $$

The gas multiplication fluctuation in the chamber deteriorates $\sigma_{x_{COG}}$ by a factor
of about $\sqrt{2}$. The prediction of this model is in good agreement with results from the
simulation.

2.2. Secondary ionization effect

A charged particle penetrating the gas of the detector produces N primary electrons. Primary
electron i produces $n_i - 1$ secondary electrons. Each of these electrons is amplified in the
electric field by a factor of $g^i_j$.

Each primary cluster is characterized by a position $x_i$ with mean value $\bar{x}$ and r.m.s.
$\sigma_x$. The COG given by equation (4) is modified to the following form:

$$ x_{COG} = \frac{1}{\sum_{i=1}^{N}\sum_{j=1}^{n_i} g^i_j}\,\sum_{i=1}^{N} x_i \sum_{j=1}^{n_i} g^i_j \qquad (11) $$

A new variable $G_n$ is introduced as the total electron gain:

$$ G_n = \sum_{j=1}^{n} g_j \qquad (12) $$

Knowing the distributions of n and g, and assuming that n and g are independent variables, the
mean value and variance of $G_n$ can be expressed as:

$$ \bar{G}_n = \bar{n}\,\bar{g} \qquad (13) $$

$$ \frac{\sigma^2_{G_n}}{\bar{G}_n^2} = \frac{\sigma_n^2}{\bar{n}^2} + \frac{\sigma_g^2}{\bar{g}^2\,\bar{n}} \qquad (14) $$

Inserting $G_n$ into equation (11) results in an equation similar to equation (4).

The multiplicative factor $G_{Lfactor}$ is defined as an analog of $G_{gfactor}$ from equation
(7):

$$ G_{Lfactor} = N\,\overline{\left(\frac{\sum_i G_i^2}{\sum_i \sum_j G_i G_j}\right)} \qquad (15) $$

Using the new variable $G_n$ and simply replacing the gas gain g by $G_n$, in a similar way as
in equation (8), does not work. For the $1/E^2$ parametrization of the secondary ionization
process, $\sigma^2_{G_n}/\bar{G}_n^2$ goes to infinity and thus
$\sigma^2_{x_{COG}} \to \sigma_x^2$. Moreover, $\sum G_i^2$ and $\sum\sum G_i G_j$ are not
quasi-independent, as the sum $\sum\sum G_i G_j$ could be given by one "exotic" electron
cluster. The approximations used for deriving equation (8) are not valid for the secondary
ionization effect.

In order to estimate the impact of this effect on the COG, equation (15) has to be solved
numerically. Simulation showed that $G_{Lfactor}$ does not depend strongly on the cut used for
the maximum number of electrons created in the process of secondary ionization. A change of the
cut, from 1000 electrons up, produces a change of about 3% in $G_{Lfactor}$.

Equation (8) is not applicable in this situation because $\sigma_G$ is infinite. According to
the simulation, the threshold on the number of electrons in the cluster has little influence on
the resulting $G_{Lfactor}$. Therefore we fitted the simulated $G_{Lfactor}$ with formula (8),
where $\sigma_G^2/\bar{G}^2$ was a free parameter. However, this parametrization does not
describe the data for a wide enough range of N. In the further study a linear parametrization
of the COG factor was used. This parametrization was validated on a reasonable interval of N.
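Returning to the gas-gain model of Section 2.1, the closed form 2N/(N+1) for exponentially distributed gains can be cross-checked with a few lines of Monte Carlo. The sketch below is only an illustration under the stated model (independent, exponentially distributed gains); the function names, the number of trials and the seed are arbitrary choices for this example and do not correspond to the simulation used in the paper.

```python
import random


def gas_gain_factor_mc(n_electrons, n_trials=20000, mean_gain=1.0, seed=1):
    """Monte Carlo estimate of N * <sum(g_i^2) / (sum(g_i))^2>.

    Each of the n_electrons gains is drawn from an exponential
    distribution with the given mean (an assumption of the simple model).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        gains = [rng.expovariate(1.0 / mean_gain) for _ in range(n_electrons)]
        gain_sum = sum(gains)
        total += sum(g * g for g in gains) / (gain_sum * gain_sum)
    return n_electrons * total / n_trials


def gas_gain_factor_model(n_electrons):
    """Closed-form value 2N / (N + 1) for exponential gain fluctuations."""
    return 2.0 * n_electrons / (n_electrons + 1.0)
```

For large N both values approach 2, which corresponds to the deterioration of the position r.m.s. by a factor of about $\sqrt{2}$ quoted in the text.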
3. Center-of-gravity error parametrization

The detected position of a charged particle is a random variable given by several stochastic
processes: diffusion, angular effect, gas gain fluctuation, Landau fluctuation of the secondary
ionization, E×B effect, electronic noise and systematic effects (like space charge, etc.). The
relative influence of these processes on the resulting distortion of the position determination
depends on the detector parameters. In big drift detectors like the ALICE TPC the main
contribution is given by diffusion, gas gain fluctuation, angular effect and secondary
ionization fluctuation.

Furthermore, we will use the following assumptions:

• $N_{prim}$ primary electrons are produced at random positions $\vec{x}^i$ along the
particle trajectory.

• $n_i - 1$ electrons are produced in the process of secondary ionization.

• The displacement of the produced electrons due to thermalization is neglected.

Each electron is characterized by a random vector $\vec{z}^i_j$:

$$ \vec{z}^i_j = \vec{x}^i + \vec{y}^i_j \qquad (16) $$

where i is the index of the primary electron cluster and j is the index of the secondary
electron inside the primary electron cluster. The random variable $\vec{x}^i$ is the position
where the primary electron was created. The position $\vec{y}^i_j$ is a random variable
specific to each electron. It is given mainly by diffusion.

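The center of gravity of the electron cloud is given by:

$$ \vec{z}_{COG} = \frac{1}{\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g^i_j}\,\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g^i_j\,\vec{z}^i_j $$

$$ \qquad = \frac{1}{\sum_{i}\sum_{j} g^i_j}\,\sum_{i=1}^{N_{prim}} \vec{x}^i \sum_{j=1}^{n_i} g^i_j + \frac{1}{\sum_{i}\sum_{j} g^i_j}\,\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g^i_j\,\vec{y}^i_j = \vec{x}_{COG} + \vec{y}_{COG} \qquad (17) $$

The mean value $\bar{z}_{COG}$ is equal to the sum of the mean values $\bar{x}_{COG}$ and
$\bar{y}_{COG}$.

The sigma of the COG in one of the dimensions of the vector $\vec{z}_{COG}$ (here component 1)
is given by the following equation:

$$ \sigma^2_{z_{1COG}} = \sigma^2_{x_{1COG}} + \sigma^2_{y_{1COG}} + 2\left(\overline{x_{1COG}\,y_{1COG}} - \bar{x}_{1COG}\,\bar{y}_{1COG}\right) \qquad (18) $$

If the vectors $\vec{x}$ and $\vec{y}$ are independent random variables, the last term in
equation (18) is equal to zero:

$$ \sigma^2_{z_{1COG}} = \sigma^2_{x_{1COG}} + \sigma^2_{y_{1COG}} \qquad (19) $$

The r.m.s. of the COG distribution is then given by the quadratic sum of the r.m.s. of the x
and y components.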

In order to estimate the influence of the E×B and unisochronity effects on the space
resolution, two additional random vectors are added to the initial electron position:

$$ \vec{z}^i_j = \vec{x}^i + \vec{y}^i_j + \vec{X}_{ExB}(\vec{x}^i + \vec{y}^i_j) + \vec{X}_{Unisochron}(\vec{x}^i + \vec{y}^i_j) \qquad (20) $$

The probability distributions of $\vec{X}_{ExB}$ and $\vec{X}_{Unisochron}$ are functions of
the random vectors $\vec{x}^i$ and $\vec{y}^i_j$, and they are strongly correlated. However,
simulation indicates that in large drift detectors the distortions due to these effects are
negligible compared with the previous ones.

Combining the previous equations and neglecting the E×B and unisochronity effects, the COG
distortion parametrization appears as follows.

$\sigma_z$ of the cluster center in the z (time) direction:

$$ \sigma^2_{z_{COG}} = \frac{D_L^2\,L_{Drift}}{N_{ch}}\,G_g + \frac{\tan^2\alpha\;L_{pad}^2\,G_{Lfactor}(N_{chprim})}{12\,N_{chprim}} + \sigma^2_{noise} \qquad (21) $$

and $\sigma_y$ of the cluster center in the y (pad) direction:

$$ \sigma^2_{y_{COG}} = \frac{D_T^2\,L_{Drift}}{N_{ch}}\,G_g + \frac{\tan^2\beta\;L_{pad}^2\,G_{Lfactor}(N_{chprim})}{12\,N_{chprim}} + \sigma^2_{noise} \qquad (22) $$

where $N_{ch}$ is the total number of electrons in the cluster, $N_{chprim}$ is the number of
primary electrons in the cluster, $G_g$ is the gas gain fluctuation factor, $G_{Lfactor}$ is
the secondary ionization fluctuation factor and $\sigma_{noise}$ describes the contribution of
the electronic noise to the resulting sigma of the COG.
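The structure of this parametrization (a diffusion term scaled by the gas gain factor, an angular term scaled by the secondary ionization factor, and a noise term) can be written as a small helper. This is a sketch for illustration only: the function and argument names are hypothetical, the fluctuation factors are passed in as plain numbers, and all quantities are assumed to be given in mutually consistent units.

```python
def cog_variance(diffusion, drift_length, n_ch, n_chprim,
                 tan_angle, pad_length, g_gain, g_landau, sigma_noise):
    """Variance of the cluster COG in one direction.

    diffusion    -- diffusion coefficient for this direction
    drift_length -- drift length of the electron cloud
    n_ch         -- total number of electrons in the cluster
    n_chprim     -- number of primary electrons in the cluster
    tan_angle    -- tangent of the relevant track angle
    g_gain       -- gas gain fluctuation factor (about 2 in the simple model)
    g_landau     -- secondary ionization fluctuation factor
    """
    diffusion_term = diffusion ** 2 * drift_length / n_ch * g_gain
    angular_term = (tan_angle ** 2 * pad_length ** 2 * g_landau
                    / (12.0 * n_chprim))
    return diffusion_term + angular_term + sigma_noise ** 2
```

For a track with zero angle only the diffusion and noise terms survive, which is the limit assumed by the cluster finder for high-momentum primaries.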



4. Precision of cluster COG determination using measured amplitude

We have derived a parametrization using as parameters the total number of electrons $N_{ch}$
and the number of primary electrons $N_{chprim}$. This parametrization is in good agreement
with simulated data, where $N_{ch}$ and $N_{chprim}$ are known. It can be used as an estimate
of the limits of accuracy if the mean values $\bar{N}_{ch}$ and $\bar{N}_{chprim}$ are used
instead.

$N_{ch}$ and $N_{chprim}$ are random variables described by a Landau distribution and a Poisson
distribution, respectively.

In order to use the previously derived formulas (21) and (22), the number of electrons can be
estimated assuming its proportionality to the total measured charge A in the cluster. However,
it turns out that an empirical parametrization of the factors $G(N)/N = G(A)/(kA)$ gives better
results. Formulas (21) and (22) are transformed to the following form.

$\sigma_z$ of the cluster center in the z (time) direction:

$$ \sigma^2_{z_{COG}} = \frac{D_L^2\,L_{Drift}\,G_g(A)}{A\,k_{ch}} + \frac{\tan^2\alpha\;L_{pad}^2\,G_{Lfactor}(A)}{12\,A\,k_{prim}} + \sigma^2_{noise} \qquad (23) $$

and $\sigma_y$ of the cluster center in the y (pad) direction:

$$ \sigma^2_{y_{COG}} = \frac{D_T^2\,L_{Drift}\,G_g(A)}{A\,k_{ch}} + \frac{\tan^2\beta\;L_{pad}^2\,G_{Lfactor}(A)}{12\,A\,k_{prim}} + \sigma^2_{noise} \qquad (24) $$



5. Estimation of the precision of cluster position determination using measured cluster shape

The shape of the cluster is given by the convolution of the responses to the electron
avalanches. The time response function and the pad response function are almost Gaussian, as
is the spread of electrons due to diffusion. The spread due to the angular effect is uniform.
Assuming that the contribution of the angular spread does not dominate the cluster width, the
cluster shape is not far from Gaussian. Therefore, we can use the parametrization

$$ f(t,p) = K_{Max}\,\exp\left(-\frac{(t-t_0)^2}{2\sigma_t^2} - \frac{(p-p_0)^2}{2\sigma_p^2}\right) \qquad (25) $$

where $K_{Max}$ is the normalization factor, t and p are the time and pad bins, $t_0$ and
$p_0$ are the centers of the cluster in the time and pad directions, and $\sigma_t$ and
$\sigma_p$ are the r.m.s. of the time and pad cluster distributions.

The mean width of the cluster distribution is given by:

$$ \sigma_t = \sqrt{D_L^2\,L_{drift} + \sigma^2_{preamp} + \frac{\tan^2\alpha\;L_{pad}^2}{12}} \qquad (26) $$

$$ \sigma_p = \sqrt{D_T^2\,L_{drift} + \sigma^2_{PRF} + \frac{\tan^2\beta\;L_{pad}^2}{12}} \qquad (27) $$

where $\sigma_{preamp}$ and $\sigma_{PRF}$ are the r.m.s. of the time response function and the
pad response function, respectively.

The fluctuation of the shape depends on the contribution of the random diffusion and angular
spread, and on the contribution given by the gas gain fluctuation and secondary ionization. The
fluctuation of the time and pad response functions is small compared with the previous ones.

The measured r.m.s. of the cluster is influenced by a threshold effect:

$$ \sigma_t^2 = \sum_{A(t,p) > threshold} (t - t_0)^2 \times A(t,p) \qquad (28) $$

The threshold effect can be eliminated by using a two-dimensional Gaussian fit instead of the
simple COG method. However, this approach is slow and, moreover, the result is very sensitive
to the gain fluctuation.

To eliminate the threshold effect in the r.m.s. method, the bins below threshold are replaced
with a virtual charge using a Gaussian interpolation of the cluster shape. The introduction of
the virtual charge improves the precision of the COG measurement. Large systematic shifts in
the estimate of the cluster position (depending on the local track position relative to
pad-time) due to the threshold are no longer observed.

By measuring the r.m.s. of the cluster, the local diffusion and angular spread of the electron
cloud can be estimated. This provides additional information for the estimation of
distortions. A simple additional correction function is used:

$$ \sigma_{COG} \to \sigma_{COG}(A) \times \left(1 + const \times \frac{\delta RMS}{teorRMS}\right) \qquad (29) $$

where $\sigma_{COG}(A)$ is calculated according to formulas (23) and (24), and
$\delta RMS/teorRMS$ is the relative distortion of the signal shape from the expected one.
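The expected cluster width and the shape-based correction of the COG error can be combined in a short sketch. The helper names are hypothetical, the constant in the correction is left as a free parameter, and $\delta RMS$ is assumed here to be the signed difference between the measured and the expected r.m.s.; the text only states that it is the relative distortion of the shape.

```python
import math


def expected_width(diffusion, drift_length, sigma_response,
                   tan_angle, pad_length):
    """Expected cluster r.m.s. in one direction: quadratic sum of the
    diffusion, response-function and angular contributions."""
    return math.sqrt(diffusion ** 2 * drift_length
                     + sigma_response ** 2
                     + tan_angle ** 2 * pad_length ** 2 / 12.0)


def corrected_cog_sigma(sigma_cog, measured_rms, expected_rms, const=1.0):
    """Scale the amplitude-based COG error by the relative shape distortion.

    Assumption: delta RMS = measured_rms - expected_rms.
    """
    relative_distortion = (measured_rms - expected_rms) / expected_rms
    return sigma_cog * (1.0 + const * relative_distortion)
```

A cluster that is wider than expected, for example because of an overlap, thus receives a larger position error, while a cluster with the expected width keeps the error derived from its amplitude.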



6. TPC cluster finder

The classical approach for the beginning of the tracking was chosen. Before the tracking
itself, two-dimensional clusters in the pad-row-time planes are found. Then the positions of
the corresponding space points are reconstructed, which are interpreted as the crossing points
of the tracks and the centers of the pad rows. We investigate a region of 5×5 bins in the
pad-row-time plane around the central bin with maximum amplitude. The size of the region, 5×5
bins, is bigger than the typical size of a cluster, as $\sigma_t$ and $\sigma_{pad}$ are about
0.75 bins.

The COG and r.m.s. are used to characterize a cluster. Both are affected by systematic
distortions induced by the threshold effect. Depending on the number of time bins and pads in
the cluster, the COG and r.m.s. are affected in different ways. Unfortunately, the number of
bins in a cluster is a function of the local track position. To get rid of this effect,
two-dimensional Gaussian fitting can be used.

Figure 2: Schematic view of the unfolding principle.

Figure 3: Dependence of the position residual on the distance to the second cluster.



Similar results can be achieved by so-called r.m.s. fitting using a virtual charge. The signal
below threshold is replaced by the virtual charge, i.e. by its expected value according to a
Gaussian interpolation. If the virtual charge is above the threshold value, it is replaced with
an amplitude equal to the threshold value. The signal r.m.s. is used for later error estimation
and as a criterion for cluster unfolding. This method gives results comparable to a Gaussian
fit of the cluster but is much faster. Moreover, the COG position is less sensitive to the gain
fluctuations.
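The virtual-charge idea can be sketched as follows: bins that were removed by zero suppression are refilled with the amplitude expected from a Gaussian cluster shape, capped at the threshold, before the COG and r.m.s. are recomputed. This is a simplified illustration, not the production cluster finder; the data layout (a dictionary keyed by time and pad bin), the function names and the use of the maximum bin as the Gaussian normalization are assumptions of this example.

```python
import math


def cog_and_rms(amplitudes):
    """Return (t0, p0, rms_t, rms_p) for a dict {(t, p): amplitude}."""
    total = sum(amplitudes.values())
    t0 = sum(a * t for (t, p), a in amplitudes.items()) / total
    p0 = sum(a * p for (t, p), a in amplitudes.items()) / total
    var_t = sum(a * (t - t0) ** 2 for (t, p), a in amplitudes.items()) / total
    var_p = sum(a * (p - p0) ** 2 for (t, p), a in amplitudes.items()) / total
    return t0, p0, math.sqrt(var_t), math.sqrt(var_p)


def add_virtual_charge(amplitudes, threshold, sigma_t, sigma_p, half_size=2):
    """Fill suppressed bins of the (2*half_size+1)^2 region around the
    maximum with the expected Gaussian amplitude, capped at the threshold."""
    t0, p0, _, _ = cog_and_rms(amplitudes)
    t_max, p_max = max(amplitudes, key=amplitudes.get)
    a_max = amplitudes[(t_max, p_max)]
    filled = dict(amplitudes)
    for t in range(t_max - half_size, t_max + half_size + 1):
        for p in range(p_max - half_size, p_max + half_size + 1):
            if filled.get((t, p), 0.0) > 0.0:
                continue  # measured bin, keep as is
            expected = a_max * math.exp(
                -((t - t0) ** 2) / (2.0 * sigma_t ** 2)
                - ((p - p0) ** 2) / (2.0 * sigma_p ** 2))
            filled[(t, p)] = min(expected, threshold)
    return filled
```

For an isolated, centered Gaussian cluster the thresholded r.m.s. underestimates the width, and refilling the suppressed bins brings it back towards the value obtained without a threshold.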

The cluster shape depends on the track parameters. The response function contribution and the
diffusion contribution to the cluster r.m.s. are known during clustering. This is not true for
the angular contribution to the cluster width. The cluster finder should be optimised for
high-momentum particles coming from the primary vertex. Therefore, a conservative approach was
chosen, assuming the angle $\alpha$ to be zero. The tangent of the angle $\beta$ is given by
the z-position and the pad-row radius, which are known during clustering.

6.1. Cluster unfolding 

The estimated width of the cluster is used as a criterion for cluster unfolding. If the r.m.s.
in one of the directions is greater than a critical r.m.s., the cluster is considered for
unfolding. The fast spline method is used here. We require the charge to be conserved in this
method. Overlapped clusters are supposed to have the same r.m.s., which is equivalent to the
same track angles. If this assumption is not fulfilled, the tracks diverge very rapidly.

The unfolding algorithm has the following steps:

• Six amplitudes $C_i$ are investigated (see fig. 2). The first (left) local maximum,
corresponding to the first cluster, is placed at position 3; the second (right) local
maximum, corresponding to the second cluster, is at position 5.

• In the first iteration, the amplitude in bin 4 corresponding to the cluster on the left
side, $A_{L4}$, is calculated using polynomial interpolation, assuming the virtual amplitude
$A_{L5}$ and the derivative at $A_{L5}$ to be 0. The amplitudes $A_{L2}$ and $A_{L3}$ are
considered not to be influenced by the overlap ($A_{L2} = C_2$ and $A_{L3} = C_3$).

• The amplitude $A_{R4}$ is calculated in a similar way. In the next iteration the amplitude
$A_{L4}$ is calculated requiring charge conservation, $C_4 = A_{R4} + A_{L4}$. Consequently

$$ A_{L4} \to C_4\,\frac{A_{L4}}{A_{L4} + A_{R4}} \qquad (30) $$

and

$$ A_{R4} \to C_4\,\frac{A_{R4}}{A_{L4} + A_{R4}} \qquad (31) $$
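The charge-conservation step of equations (30) and (31) amounts to sharing the measured amplitude of the common bin in proportion to the two interpolated estimates. A minimal sketch, with a hypothetical function name and an equal split as an assumed fallback when both estimates vanish:

```python
def share_overlap_bin(c4, a_left, a_right):
    """Split the measured amplitude c4 of the shared bin between the left
    and right cluster in proportion to their interpolated estimates."""
    total = a_left + a_right
    if total <= 0.0:
        # Assumed fallback: no information, share the charge equally.
        return 0.5 * c4, 0.5 * c4
    return c4 * a_left / total, c4 * a_right / total
```

By construction the two returned amplitudes always add up to the measured amplitude of the bin.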



The two-cluster resolution depends on the distance between the two tracks. Until the shape of
the cluster triggers unfolding, only one cluster is reconstructed and there is a systematic
shift towards the COG of the two tracks (see fig. 3). Afterwards, no systematic shift is
observed.



6.2. Cluster characteristics 

The cluster is characterized by the COG in the y and z directions (fY and fZ) and by the
cluster width (fSigmaY, fSigmaZ). The deposited charge is described by the signal at maximum
(fMax) and the total charge in the cluster (fQ). The cluster type is characterized by the data
member fCType, which is defined as the ratio of the charge supposed to be deposited by the
track to the total charge in the investigated 5×5 region. The error of the cluster position is
assigned to the cluster only during tracking, according to formulas (23) and (24), when the
track angles $\alpha$ and $\beta$ are known with sufficient precision.

Obviously, by measuring the position of each electron separately the effect of the gas gain
fluctuation can be removed; however, this is not easy to implement in large TPC detectors.
Additional information about the cluster asymmetry can be used, but the resulting improvement
of around 5% in precision on simulated data is negligible, and it is questionable how
successful such a correction for the cluster asymmetry would be on real data.

However, the cluster asymmetry can be used as an additional criterion for cluster unfolding.
Let us denote by $\mu_i$ the i-th central moment of the cluster, which was created by the
overlap of two sub-clusters with unknown positions and deposited energies (with moments
${}^1\mu_i$ and ${}^2\mu_i$).

Let $r_1$ be the ratio of the two cluster amplitudes:

$$ r_1 = {}^1\mu_0 / ({}^1\mu_0 + {}^2\mu_0) $$

and let the track distance d be equal to

$$ d = {}^1\mu_1 - {}^2\mu_1. $$

Assuming that the second moments of both sub-clusters are the same
(${}^0\mu_2 = {}^1\mu_2 = {}^2\mu_2$), the distance d between the two sub-clusters and the
amplitude ratio $r_1$ can be estimated:

$$ R = \frac{\mu_3^2}{(\mu_2 - {}^0\mu_2)^3} \qquad (32) $$

$$ r_1 = 0.5 \pm 0.5 \times \sqrt{\frac{1}{1 + 4/R}} \qquad (33) $$

$$ d = \sqrt{(4 + R) \times (\mu_2 - {}^0\mu_2)} \qquad (34) $$

In order to trigger unfolding using the shape information, additional information about the
track and the mean cluster shape over several pad-rows is needed. This information is available
only during the tracking procedure.
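The moment relations (32) to (34) can be checked on a synthetic overlap of two identical sub-clusters. The sketch below returns the amplitude fraction of the larger sub-cluster and the unsigned distance; the function name is hypothetical, and returning `None` when the measured second moment does not exceed the single-cluster one is a choice made for this example.

```python
import math


def split_from_moments(mu2, mu3, mu2_single):
    """Estimate (fraction of the larger sub-cluster, distance d) from the
    measured second and third central moments and the expected second
    moment of a single cluster. Returns None if no excess width is seen."""
    excess = mu2 - mu2_single
    if excess <= 0.0:
        return None
    ratio = mu3 ** 2 / excess ** 3           # R of equation (32)
    root = math.sqrt(ratio / (4.0 + ratio))  # same as sqrt(1 / (1 + 4/R))
    fraction = 0.5 + 0.5 * root              # '+' branch of equation (33)
    distance = math.sqrt((4.0 + ratio) * excess)
    return fraction, distance
```

As a worked example, two sub-clusters with amplitude fractions 0.7 and 0.3, separated by d = 2 and each with second moment 0.5, give $\mu_2 = 0.5 + 0.7 \times 0.3 \times 4 = 1.34$ and $|\mu_3| = 0.7 \times 0.3 \times 0.4 \times 8 = 0.672$; inserting these values recovers the fraction 0.7 and the distance 2.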



6.3. TPC seed finding 

The first and the most time-consuming step in tracking is seed finding. Two different seeding
strategies are used: combinatorial seeding with a vertex constraint, and a simple track
follower.



6.4. Combinatorial seeding algorithm 

Combinatorial seeding starts with a search for all pairs of points in pad-row number i1 and in
pad-row i2, n rows closer to the interaction point (n = i1 − i2 = 20 at present), which can
project to the primary vertex. The position of the primary vertex is reconstructed, with high
precision, from hits in the ITS pixel layers, independently of the track determination in the
TPC.

Figure 4: Schematic view of the combinatorial seeding procedure.

The algorithm of combinatorial seeding consists of the following steps:

• Loop over all clusters on pad-row i1

  - Loop over all clusters on pad-row i2 inside a given window. The size of the window is
    defined by a cut on the track curvature (C), requiring primary tracks with $p_t$ above
    a threshold to be seeded.

    * When a reasonable pair of clusters is found, the parameters of a helix going through
      these points and the primary vertex are calculated. The parameters of this helix are
      taken as an initial approximation of the parameters of the potential track. The
      corresponding covariance matrix is evaluated using the point errors, which are given
      by the cluster finder, and applying an uncertainty of the primary vertex position.
      This is the only place where a certain (not too strong) vertex constraint is
      introduced. Later on, tracks are allowed to have any impact parameters at the primary
      vertex, both in the z-direction and in the r-φ plane.

    * Using the calculated helix parameters and their covariance matrix, the Kalman filter
      is started from the outer point of the pair to the inner one.

    * If at least half of the potential points between the initial ones were successfully
      associated with the track candidate, the track is saved as a seed.

  - End of loop over pad-row i2

• End of loop over pad-row i1
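The nested loop above can be sketched in the transverse plane only: for each pair of clusters, the curvature of the circle through the primary vertex and the two points is computed, and the pair is kept if the curvature is below the cut. This is a reduced illustration with hypothetical names; it ignores the z coordinate, the covariance matrix and the subsequent Kalman filtering between the two pad-rows.

```python
import math


def curvature_through_points(p1, p2, p3):
    """Signed curvature (1/R) of the circle through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a = math.hypot(x2 - x1, y2 - y1)
    b = math.hypot(x3 - x2, y3 - y2)
    c = math.hypot(x1 - x3, y1 - y3)
    if a * b * c == 0.0:
        raise ValueError("points must be distinct")
    # Twice the signed triangle area; R = a*b*c / (4 * area).
    cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
    return 2.0 * cross / (a * b * c)


def seed_pairs(outer_clusters, inner_clusters, vertex, max_curvature):
    """Return (outer, inner, curvature) for all pairs compatible with a
    primary track whose curvature is below max_curvature."""
    seeds = []
    for outer in outer_clusters:          # loop over pad-row i1
        for inner in inner_clusters:      # loop over pad-row i2
            curv = curvature_through_points(vertex, inner, outer)
            if abs(curv) <= max_curvature:
                seeds.append((outer, inner, curv))
    return seeds
```

The double loop makes the quadratic cost of the method explicit; restricting the inner loop to a window around the expected position is what the curvature cut achieves in practice.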
6.5. Track following seeding algorithm

Seeding between two pad-rows, i1 and i2, starts in the middle pad-row. For each cluster in the
middle pad-row, the two nearest clusters, in the pad-rows above and below, are found.
Afterwards, a linear fit in both directions (z and y) is calculated. The expected prolongations
to the next two pad-rows are calculated. For each prolongation the two nearest clusters are
found again. The algorithm continues recursively up to the pad-rows i1 and i2. The linear fit
is replaced by a polynomial one after 7 clusters. If more than half of the potential clusters
are found, the track parameters and covariance are calculated as before.

6.6. Seed finding strategy 



Table I: Combinatorial seeding efficiency and time consumption as a function of the distance
between two pad-rows.

distance (pad-rows)   time [s]   efficiency [%]
24                    95         92.2
20                    52         90.4
16                    34         88.7
14                    25         88.1
12                    19         85.2


The main advantage of combinatorial seeding is its high efficiency, around 90% for primaries
with $p_t$ > 200 MeV/c. The main disadvantage is the $N^2$ problem of the combinatorial
search. The $N^2$ problem can be reduced by restricting the size of the seeding window. This
can be achieved by making the distance between the seeding pad-rows smaller, as the size of
the window is proportional to the distance i1 − i2. However, when the seeding distance is
decreased, the efficiency of seeding and also the quality of the seeds deteriorate. The size of
the window can also be reduced by reducing the threshold curvature of the track candidate.

However, the vertex constraint suppresses secondaries, which should also be found. The track
following seeding has to be used for them. This strategy is much faster but less efficient
(80%). The efficiency is decreased mainly by the effect of track overlaps and, for low-$p_t$
tracks, by the angular effect, which correlates the cluster position distortions between
neighbouring pad-rows.

The efficiency of seeding can be increased by repeating the seeding procedure in different
layers of the TPC. Assuming that overlapped tracks are a random background for the track which
should be seeded, the total efficiency of the seeding can be expressed as

$$ \epsilon_{all} = 1 - \prod_i (1 - \epsilon_i), $$

where $\epsilon_i$ is the efficiency of one seeding. By repeating the seeding, the efficiency
should reach up to 100%. Unfortunately, tracks are sometimes very close over a long path, and
seedings in different layers cannot be considered independent. The efficiency of seeding
saturates at a value smaller than 1. Another problem with repetitive seeding is that the
occupancy increases towards lower pad-row radii, and thus the efficiency is a function of the
pad-row radius.
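Under the independence assumption stated above, the combined efficiency of repeated seedings follows directly from the product of the individual miss probabilities. A minimal sketch with a hypothetical function name:

```python
def combined_seeding_efficiency(efficiencies):
    """Total efficiency 1 - prod(1 - eps_i) for independent seedings."""
    miss_probability = 1.0
    for eff in efficiencies:
        miss_probability *= (1.0 - eff)
    return 1.0 - miss_probability
```

For example, two independent seedings with 80% efficiency each would give 1 − 0.2 × 0.2 = 96%; as noted in the text, correlations between close tracks make the real value saturate below this idealized estimate.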

However, in order to find secondaries from kinks or V0 decays, it is necessary to perform
highly efficient seeding in the outermost pad-rows. On the other hand, in the case of kinks in
the high-density environment, it is almost impossible to start the tracking of the primary
particles using only the last point of the secondary track, because this point is not well
defined. In order to find them, seeding in the innermost pad-rows should be performed. In both
seeding strategies, a large decrease of efficiency and precision due to the dead zones is
observed. Additional seeding at the sector edges is necessary. The length of the pads for the
outermost 30 pad-rows is greater than for the other pad-rows. The minimum of the occupancy and
the maximum of the seeding efficiency are obtained when the outer pad-rows are used. In order
to maximize the tracking efficiency for secondaries it is necessary to perform almost
continuous seeding inside the TPC. Several combinations of the slow combinatorial and the fast
seeding were investigated. Depending on the required efficiency, different amounts of time can
be spent on seeding. The default seeding for the tracking performance results was chosen as
follows: two combinatorial seedings in the outermost 20 pad-rows, and six track following
seedings homogeneously spaced inside the outermost sector.

More sophisticated and faster seeding is currently under development. It is planned to use for
seeding only the clusters which were not assigned to tracks classified as almost perfect. The
criteria for an almost perfect track have to be defined, depending on the track density.

7. Parallel Kalman tracking 

After seeding, several track hypotheses are tracked in parallel. The following algorithm is
used:

• For each track candidate the prolongation to the next pad-row is found.

• The nearest cluster is found.

• The cluster position distortions are estimated according to the track and cluster
parameters.

• The track is updated according to the current cluster parameters and errors.

• Overlapped track hypotheses, i.e. those which share too many clusters, are removed.

• Inactive hypotheses are stopped.

• The procedure continues down to the last pad-row.

The prolongation to the next pad-row is calculated
according to the current track hypothesis. Distortions of the
local track position, $\sigma_y$ and $\sigma_z$, are calculated
from the covariance matrix. For each track prolongation
a window is calculated. The width of the window is
set to $\pm 4\sigma$, where $\sigma$ is given by the convolution of the
predicted track error and the predicted expectation for the
cluster r.m.s. Clusters in the container are ordered according
to their coordinates, and a binary search with log(n)
performance is used. The nearest cluster is taken as the
most probable one. No cluster competition is currently
implemented because of the memory required when
branching the Kalman track hypothesis and because
of the performance penalty.
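
The window and the nearest-cluster lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the search is shown in one coordinate only, and the "convolution" of the two errors is taken to be a quadratic sum, which holds for gaussian errors.

```python
import bisect
import math

def search_window(sigma_track, sigma_cluster, n_sigma=4.0):
    """Half-width of the search window: n_sigma times the combined error.

    Assumption: the convolution of the predicted track error and the
    expected cluster r.m.s. is approximated by their quadratic sum.
    """
    return n_sigma * math.hypot(sigma_track, sigma_cluster)

def nearest_cluster(sorted_y, y_pred, half_width):
    """Index of the cluster closest to y_pred, or None if it is outside the window.

    sorted_y must be sorted in ascending order, so that the lookup
    costs O(log n) through a binary search.
    """
    if not sorted_y:
        return None
    i = bisect.bisect_left(sorted_y, y_pred)
    # Only the two neighbours of the insertion point can be the nearest.
    candidates = [j for j in (i - 1, i) if 0 <= j < len(sorted_y)]
    best = min(candidates, key=lambda j: abs(sorted_y[j] - y_pred))
    return best if abs(sorted_y[best] - y_pred) <= half_width else None
```

A call such as `nearest_cluster(ys, y_pred, search_window(s_trk, s_cl))` then either returns one candidate index or reports that the pad-row has no cluster inside the window.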

The width of the search window was chosen to take
into account also overlapped clusters. The position
error in this case can be significantly larger than the
estimated error for a non-overlapped cluster, and the
overlap factor is not known a priori. On the other hand,
the minimal distance between two reconstructed clusters
is restricted by the local maximum requirement. Two
clusters at a distance of less than about 2 bins (~1 cm)
cannot be observed.

Once the nearest cluster is found, the cluster error is
estimated using the cluster position and the amplitude
according to formulas (24) and (23). The correction for
the cluster shape and overlap factor is calculated
according to formula (29).

The cluster is finally accepted if the square of the residual
in each of the two directions is smaller than the square of
the estimated $3\sigma$. If this is the case, the track parameters
are updated according to the cluster position and the error
estimates.

It may occur that the track leaves the TPC sector
and enters another one. In this case the track
parameters and the covariance matrix are recalculated
so that they are always expressed in the local coordinate
system of the sector within which the track is at
that moment. The variable fNFindable is defined as
the number of potentially findable clusters. If the track is
locally inside the sensitive volume, fNFindable is
incremented; otherwise it remains unchanged.

If no clusters are found in several pad-rows in an
active region of the TPC, the track hypothesis should be
removed. The cluster density is defined as
the ratio of accepted clusters to all findable clusters
in a region, where a region is several pad-rows.

It is not known a priori whether a given track is primary
or secondary, therefore the local density cannot be
interpreted definitely as the real density. This would be true
only for tracks which really go through all considered
pad-rows. Tracks with low local density are not completely
removed; they are only flagged (fRemoval variable)
for the next analysis.

In order to be able to remove track hypotheses
which are almost the same, the so-called overlap factor is
defined. It is the ratio of the number of clusters shared between
two track candidates to the number of all clusters.
If the overlap factor is greater than the threshold, the
track candidate with the higher $\chi^2$ or a significantly lower
number of points is removed. The threshold is a parameter;
currently we use the value 0.6 (in performance
studies). This is a compromise between the
maximal efficiency requirement and the minimal number
of double found tracks requirement. In the future this
parameter will be optimized to increase the double-track
resolution. In this case a new criterion to remove double
found tracks will have to be used.
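
A minimal sketch of this selection is given below. Several details are assumptions made for illustration and are not stated in the text: the "number of all clusters" is taken to be the cluster count of the shorter candidate, "significantly lower number of points" is modelled by a hypothetical ratio `point_ratio`, and a candidate is represented by a plain dictionary.

```python
def overlap_factor(clusters_a, clusters_b):
    """Fraction of shared clusters between two track candidates.

    Assumption: the denominator is the number of clusters of the
    shorter candidate; the text only says "number of all clusters".
    """
    set_a, set_b = set(clusters_a), set(clusters_b)
    total = min(len(set_a), len(set_b))
    return len(set_a & set_b) / total if total else 0.0

def candidate_to_remove(a, b, threshold=0.6, point_ratio=0.8):
    """Return the candidate to drop, or None if both are kept.

    a and b are dicts with keys "clusters" (cluster ids) and "chi2".
    point_ratio is a hypothetical definition of "significantly fewer points".
    """
    if overlap_factor(a["clusters"], b["clusters"]) <= threshold:
        return None
    n_a, n_b = len(a["clusters"]), len(b["clusters"])
    if n_a < point_ratio * n_b:
        return a
    if n_b < point_ratio * n_a:
        return b
    # Comparable lengths: drop the candidate with the worse fit.
    return a if a["chi2"] > b["chi2"] else b
```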

7.1. Double track resolution 

In the ALICE TPC the large track density represents
the main challenge for tracking. Below some distance
between two tracks the clusters are no longer resolved.
In our algorithm track candidates are
removed if some fraction of their clusters is common
to two track candidates. If two tracks overlap over a
very long path, there are three possibilities: either it
is the same track, or they are two very close
tracks, or they are two tracks where one changed direction
towards the second one and the change of direction was
misinterpreted as multiple scattering.

New criteria should be defined to handle this situa- 
tion. Cluster shape can be used again for this purpose. 
If the two tracks overlap and their separation is too 
small, only one cluster is reconstructed, however, its 
width is systematically greater. Moreover, the charge 
deposited in the cluster is also systematically higher. 

Another problem is with double found clusters,
mainly in the low-$p_t$ region. There are two reasons:

• The non-gaussian tail of Coulomb scattering
can change the direction of the track; the track can
be lost and found again during the next seeding.

• Because of large inclination and Landau fluctuations,
clusters with double local maxima can
be created.

In order to maximize double-track resolution, and 
to minimize the number of double found tracks, the 
new criteria (mean local deposited charge and mean 
local cluster shape) are under investigation. 

7.2. dE/dx measurement 

To estimate the mean ionization energy loss
dE/dx of a particle, a logarithmic truncated mean is used. Using
the current cluster finder, the truncation at 60% gives
the best dE/dx resolution. Currently the amplitudes
at the local cluster maxima are used, instead of the total
cluster charge, in order to avoid the distortion due to
track overlaps. Shared clusters are not used for
the estimate of the dE/dx at all.
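
As an illustration, a logarithmic truncated mean can be written as follows. The exact definition used in the tracker is not given in the text, so this sketch assumes that the lowest 60% of the amplitudes are kept and that their logarithms are averaged; the function name and its signature are hypothetical.

```python
import math

def log_truncated_mean(amplitudes, fraction=0.6):
    """Logarithmic truncated mean of positive cluster amplitudes.

    Assumption: the lowest `fraction` of the sorted amplitudes is kept,
    the mean of their logarithms is taken and then exponentiated.
    Shared clusters are expected to be removed by the caller.
    """
    if not amplitudes:
        raise ValueError("no amplitudes given")
    ordered = sorted(amplitudes)
    n_keep = max(1, int(round(fraction * len(ordered))))
    kept = ordered[:n_keep]
    return math.exp(sum(math.log(a) for a in kept) / n_keep)
```

Dropping the upper part of the sorted sample suppresses the Landau tail, which is why a truncated estimator is preferred over a plain mean.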
$\sigma_\phi$ [mrad]            1.399±0.030
$\sigma_\theta$ [mrad]          0.997±0.018
??                              0.881±0.011
$\sigma_{dEdx}/dEdx$ [%]        6.00±0.2
$\epsilon$ [%]                  99.0

Table II TPC tracking performance (dN/dy=4000
charged primaries)



The measured amplitude is normalized to the track
length, given by the angles $\alpha$ and $\beta$ and by the pad length.
Specific normalization factors are used for each pad
type, as the electronic parameters (gas gain, pad response
function) are different in different parts of the
TPC. The normalization condition requires the same
dE/dx inside each part of the TPC for one track.

A correlation between the measured dE/dx and the
particle multiplicity was observed. An additional correction
function for the cluster shape was successfully
introduced to take into account local cluster overlaps.



ing task in this experiment. The track finding effi- 
ciency increases, compared to the previous attempts, 
for primary tracks by about 10%, and even more for 
secondary tracks. The main improvement is a consequence
of the sophisticated cluster finding and deconvolution,
which is based on a detailed understanding
of the physical processes in the TPC and the optimal
usage of the achievable information. Another factor
which helped to increase the efficiency, especially for secondary
tracks, is the new seeding procedure. The ALICE
TPC tracker fulfils, and even exceeds, the basic requirement.
Further development will concentrate
on secondary vertexing inside the TPC and the possible use
of information from other detectors.
1. Introduction 

Track finding for the predicted particle densities is
one of the most challenging tasks in the ALICE experiment [1].
It is still under development and here
the current status is reported. Track finding is based
on the Kalman-filtering approach. Kalman-like algorithms
are widely used in high-energy physics experiments
and their advantages and shortcomings are well
known.

There are two main disadvantages of the Kalman 
filter, which affect the tracking in the ALICE TPC 
?. The first is that before applying the Kalman-filter 
procedure, clusters have to be reconstructed. Occu- 
pancies up to 40% in the inner sectors of the TPC 
and up to 20% in the outer sectors are expected; clus- 
ters from different tracks may be overlapped; there- 
fore a certain number of the clusters are lost, and the 
others may be significantly displaced. These displace- 
ments are rather hard to take into account. Moreover, 
these displacements are strongly correlated depending 
on the distance between two tracks. 

The other disadvantage of the Kalman-filter tracking
is that it relies essentially on the determination
of good 'seeds' to start a stable filtering procedure.
Unfortunately, for the tracking in the ALICE TPC
the seeds have to be constructed using the TPC data
themselves. The TPC is a key starting point for the
tracking in the entire ALICE set-up. Until now, practically
none of the other detectors have been able to
provide the initial information about tracks.

On the other hand, there is a whole list of very 
attractive properties of the Kalman-filter approach. 

• It is a method for simultaneous track recognition 
and fitting. 

• There is a possibility to reject incorrect space
points 'on the fly', during a single tracking pass.
Such incorrect points can appear as a consequence
of the imperfection of the cluster finder.
They may be due to noise or they may be points
from other tracks accidentally captured in the
list of points to be associated with the track under
consideration. In the other tracking methods
one usually needs an additional fitting pass
to get rid of incorrectly assigned points.

• In the case of substantial multiple scattering, 
track measurements are correlated and therefore 
large matrices (of the size of the number of mea- 
sured points) need to be inverted during a global 
fit. In the Kalman-filter procedure we only have 
to manipulate up to 5 x 5 matrices (although 
many times, equal to the number of measured 
points), which is much faster. 

• Using this approach one can handle multiple 
scattering and energy losses in a simpler way 
than in the case of global methods. 

• Kalman filtering is a natural way to find the 
extrapolation of a track from one detector to 
another (for example from the TPC to the ITS 
or to the TRD). 

The following parametrization for the track was
chosen:

$$y(x) = y_0 - \frac{1}{C}\sqrt{1 - (Cx - \eta)^2} \qquad (1)$$

$$z(x) = z_0 - \frac{\tan\lambda}{C}\arcsin(Cx - \eta) \qquad (2)$$

The state vector $x^T$ is given by the local track position
x, y and z, by the curvature C, the local $x_0$ position of the
helix center, and the dip angle $\lambda$:

$$x^T = (y, z, C, \tan\lambda, \eta), \qquad \eta \equiv C x_0 \qquad (3)$$
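
To make the parametrization concrete, the short sketch below evaluates the track position at a given local x from the five state parameters. The function name and argument order are hypothetical; the formulas are those of equations (1) and (2), with $\eta = C x_0$.

```python
import math

def track_position(x, y0, z0, curvature, tan_lambda, eta):
    """Evaluate y(x) and z(x) of the helix parametrization, eqs. (1) and (2).

    curvature is C (non-zero), eta is C times the local x position of
    the helix center. Raises ValueError where the helix does not reach x.
    """
    u = curvature * x - eta
    if abs(u) > 1.0:
        raise ValueError("helix does not reach this x")
    y = y0 - math.sqrt(1.0 - u * u) / curvature
    z = z0 - tan_lambda / curvature * math.asin(u)
    return y, z
```

For example, with C = 0.01, eta = 0 and y0 = 100 the projection onto the x-y plane is a circle of radius 100 centered at (0, 100), so `track_position(0.0, 100.0, 5.0, 0.01, 1.0, 0.0)` lies at the origin in y.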

Because of the high occupancy the standard Kalman
filter approach was modified. We tried to find the maximum
possible additional information which can be
used during cluster finding, tracking and particle identification.
Because of too many degrees of freedom
(up to 220 million 10-bit samples) we have to find a
smaller number of orthogonal parameters.

To enable using the optimal combination of local
and global information about the tracks and clusters,
the parallel Kalman filter tracking method was proposed.
Several hypotheses are investigated in parallel.
A global tracking approach such as the Hough transform
was considered only for seeding of track candidates.
In the following, the additional information
which was used will be underlined.

Figure 1: Schematic view of the detection process in TPC
(upper part - perspective view, lower part - side view).



2. Accuracy of local coordinate
measurement

The accuracy of the coordinate measurement is limited
by the track angle, which spreads the ionization, and by
diffusion, which amplifies this spread.

The track direction with respect to the pad plane is
given by two angles $\alpha$ and $\beta$ (see fig. ??). For the
measurement along the pad-row, the angle $\alpha$ between
the track projected onto the pad plane and the pad-row is
relevant. For the measurement of the drift coordinate
(z-direction) it is the angle $\beta$ between the track
and the z axis (fig. ??).

The ionization electrons are randomly distributed
along the particle trajectory. Fixing the reference x
position of an electron at the middle of the pad-row, the y
(resp. z) position of the electron is a random variable
characterized by a uniform distribution with the width
$L_a$, where $L_a$ is given by the pad length $L_{pad}$ and the
angle $\alpha$ (resp. $\beta$):

$$L_a = L_{pad} \tan\alpha$$

The diffusion smears out the position of the electron
with a gaussian probability distribution with $\sigma_D$. The
contribution of the ExB and unisochronity effects for the
ALICE TPC is negligible. The typical resolution in the
case of the ALICE TPC is on the level of $\sigma_y \sim 0.8$ mm
and $\sigma_z \sim 1.0$ mm, integrating over all clusters in the
TPC.

2.1. Gas gain fluctuation effect

Being collected on the sense wire, an electron is "multiplied"
in the strong electric field. This multiplication is
subject to large fluctuations, contributing to the
cluster position resolution. Because of these fluctuations
the center of gravity of the electron cloud can
be shifted.

Each electron is amplified independently. However,
in the reconstruction the electrons are not treated separately.
The Centre Of Gravity (COG) of the cluster is
usually used as an estimate of the local track position.
The influence of the gas gain fluctuation on the
reconstructed point characteristics can be described by
a simple model, introducing a weighted COG $x_{COG}$:

$$x_{COG} = \frac{\sum_{i=1}^{N} g_i x_i}{\sum_{i=1}^{N} g_i} \qquad (4)$$

where N is the total number of electrons in the cluster
and $g_i$ is a random variable equal to the gas amplification
for a given electron.

The mean value of $x_{COG}$ is equal to the mean value
$\bar{x}$ of the original distribution of electrons:

$$\overline{x_{COG}} = \overline{\frac{\sum_{i=1}^{N} g_i x_i}{\sum_{i=1}^{N} g_i}} = \bar{x}\,\overline{\frac{\sum_{i=1}^{N} g_i}{\sum_{i=1}^{N} g_i}} = \bar{x}. \qquad (5)$$

However, the same is not true for the dispersion of
the position,

$$\sigma_{x_{COG}}^2 = \overline{x_{COG}^2} - \overline{x_{COG}}^2
= \overline{\frac{\sum\sum x_i x_j g_i g_j}{\sum\sum g_i g_j}} - \bar{x}^2
= \overline{\frac{\overline{x^2}\sum g_i^2 + \bar{x}^2 \left(\sum\sum g_i g_j - \sum g_i^2\right)}{\sum\sum g_i g_j}} - \bar{x}^2
= \sigma_x^2\,\overline{\frac{\sum g_i^2}{\sum\sum g_i g_j}}
= \sigma_x^2 \times \frac{1}{N} \times G_{gfactor}^2 \qquad (6)$$

where

$$G_{gfactor}^2 = N \times \overline{\frac{\sum g_i^2}{\sum\sum g_i g_j}} \qquad (7)$$

The diffusion term is effectively multiplied by the gas
gain factor $G_{gfactor}$. For a sufficiently large number of
electrons, when $\sum g_i^2$ and $\sum\sum g_i g_j$ are quasi independent
variables, equation (??) can be transformed to
the following

$$G_{gfactor}^2 \approx N\,\frac{\overline{\sum g_i^2}}{\overline{\sum\sum g_i g_j}}
= N\,\frac{N\,\overline{g^2}}{N(N-1)\,\bar{g}^2 + N\,\overline{g^2}}
= N\,\frac{\sigma_g^2/\bar{g}^2 + 1}{N + \sigma_g^2/\bar{g}^2} \qquad (8)$$

The gas gain fluctuation of a gas detector working in the
proportional regime is described by the exponential
distribution with the mean value $\bar{g}$ and r.m.s.

$$\sigma_g = \bar{g} \qquad (9)$$

Substituting $\sigma_g$ into equation (??)

$$G_{gfactor}^2 = \frac{2N}{N+1}. \qquad (10)$$

The gas multiplication fluctuation in the chamber deteriorates
$\sigma_{x_{COG}}$ by a factor of about $\sqrt{2}$. The prediction
of this model is in good agreement with results from
the simulation.

2.2. Secondary ionization effect

A charged particle penetrating the gas of the detector
produces N primary electrons. Primary electron
i produces $n_i - 1$ secondary electrons. Each of these
electrons is amplified in the electric field by a factor
of $g_j^i$.

Each primary cluster is characterized by a position
$x_i$ with mean value $\bar{x}$ and $\sigma_x$. The COG given by
equation (??) is modified to the following form:

$$x_{COG} = \frac{1}{\sum_{i=1}^{N}\sum_{j=1}^{n_i} g_j^i}\,\sum_{i=1}^{N} x_i \sum_{j=1}^{n_i} g_j^i \qquad (11)$$

A new variable $G_n$ is introduced as the total electron
gain:

$$G_n = \sum_{j=1}^{n} g_j \qquad (12)$$

Knowing the distributions of n and g and assuming
that n and g are independent variables, the mean value
and variance of $G_n$ can be expressed as:

$$\overline{G_n} = \bar{n}\,\bar{g} \qquad (13)$$

$$\frac{\sigma_{G_n}^2}{\overline{G_n}^2} = \frac{\sigma_n^2}{\bar{n}^2} + \frac{\sigma_g^2}{\bar{g}^2\,\bar{n}} \qquad (14)$$

Inserting $G_n$ into equation (??) results in an equation
similar to equation (??).

The multiplicative factor $G_{Lfactor}$ is defined as an analog
of $G_{gfactor}$ from equation (??):

$$G_{Lfactor}^2 = N \times \overline{\frac{\sum G_i^2}{\sum\sum G_i G_j}} \qquad (15)$$

Using the new variable $G_n$ and simply replacing the
gas gain g by $G_n$ in a similar way as in equation
(??) does not work. For the $1/E^2$ parametrization of the
secondary ionization process, $\sigma_{G_n}$ goes to infinity and
thus $\sigma_{x_{COG}}^2 = \sigma_x^2$. Moreover, $\sum G_i^2$ and $\sum\sum G_i G_j$ are
not quasi independent, as the sum $\sum\sum G_i G_j$ could
be given by one "exotic" electron cluster. The approximations
used for deriving equation (??) are not
valid for the secondary ionization effect.

In order to estimate the impact of this effect on the
COG, equation (??) has to be solved numerically. Simulation
showed that $G_{Lfactor}$ does not depend strongly
on the cut used for the maximum number of electrons created
in the process of secondary ionization. A change
of the cut, from 1000 electrons up, produces a change
of about 3% in $G_{Lfactor}$.

Equation (??) is not applicable in this situation
because of the infinity of $\sigma_G$. According to the
simulation, the threshold on the number of electrons
in the cluster has little influence on the resulting
$G_{Lfactor}$. Therefore we fit the simulated $G_{Lfactor}$ with
formula (??), where $\sigma_G^2/\bar{G}^2$ was a free parameter. However,
this parametrization does not describe the data
for a wide enough range of N. In the further study a linear
parametrization of the COG factor was used. This
parametrization was validated on a reasonable interval
of N.
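
The gas gain factor of Section 2.1 can be checked with a small Monte Carlo experiment. The sketch below is only an illustration: it draws exponentially distributed gains with unit mean (the result does not depend on the mean), averages the ratio of the sum of squared gains to the squared sum of gains, and compares N times this average with 2N/(N+1). The function names, the number of trials and the seed are arbitrary choices.

```python
import random

def gain_factor_sq_mc(n_electrons, n_trials=20000, seed=1):
    """Monte Carlo estimate of N * mean( sum g_i^2 / (sum g_i)^2 )."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        gains = [rng.expovariate(1.0) for _ in range(n_electrons)]
        s = sum(gains)
        total += sum(g * g for g in gains) / (s * s)
    return n_electrons * total / n_trials

def gain_factor_sq_model(n_electrons):
    """Model prediction 2N/(N+1) for exponentially distributed gains."""
    return 2.0 * n_electrons / (n_electrons + 1.0)
```

For large N the factor approaches 2, which corresponds to the deterioration of the COG resolution by about $\sqrt{2}$ quoted in the text.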
3. Center-of-gravity error parametrization

The detected position of a charged particle is a random
variable given by several stochastic processes: diffusion,
angular effect, gas gain fluctuation, Landau fluctuation
of the secondary ionization, ExB effect, electronic
noise and systematic effects (like space charge,
etc.). The relative influence of these processes on the
resulting distortion of the position determination depends
on the detector parameters. In big drift detectors
like the ALICE TPC the main contribution is given
by diffusion, gas gain fluctuation, angular effect and
secondary ionization fluctuation.

Furthermore we will use the following assumptions:

• $N_{prim}$ primary electrons are produced at random
positions $x^i$ along the particle trajectory.

• $n_i - 1$ electrons are produced in the process of
secondary ionization.

• Displacement of the produced electrons due to the
thermalization is neglected.

Each of the electrons is characterized by a random vector $\vec{z}_j^i$

$$\vec{z}_j^i = \vec{x}^i + \vec{y}_j^i \qquad (16)$$

where i is the index of the primary electron cluster and
j is the index of the secondary electron inside the
primary electron cluster. The random variable $\vec{x}^i$ is the
position where the primary electron was created. The
position $\vec{y}_j^i$ is a random variable specific for each electron.
It is given mainly by diffusion.

The center of gravity of the electron cloud is given by:

$$\vec{z}_{COG} = \frac{1}{\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g_j^i}\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g_j^i\,\vec{z}_j^i
= \frac{1}{\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g_j^i}\sum_{i=1}^{N_{prim}}\vec{x}^i\sum_{j=1}^{n_i} g_j^i
+ \frac{1}{\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g_j^i}\sum_{i=1}^{N_{prim}}\sum_{j=1}^{n_i} g_j^i\,\vec{y}_j^i
= \vec{x}_{COG} + \vec{y}_{COG}. \qquad (17)$$

The mean value $\overline{\vec{z}_{COG}}$ is equal to the sum of the mean
values $\overline{\vec{x}_{COG}}$ and $\overline{\vec{y}_{COG}}$.

The sigma of the COG in one of the dimensions of the vector
$\vec{z}_{COG}$, denoted $z_{1COG}$, is given by the following equation

$$\sigma_{z_{1COG}}^2 = \sigma_{x_{1COG}}^2 + \sigma_{y_{1COG}}^2 + 2\left(\overline{x_{1COG}\,y_{1COG}} - \overline{x_{1COG}}\;\overline{y_{1COG}}\right) \qquad (18)$$

If the vectors x and y are independent random variables
the last term in equation (??) is equal to
zero:

$$\sigma_{z_{1COG}}^2 = \sigma_{x_{1COG}}^2 + \sigma_{y_{1COG}}^2. \qquad (19)$$

The r.m.s. of the COG distribution is given by the quadratic
sum of the r.m.s. of the x and y components.

In order to estimate the influence of the ExB and
unisochronity effects on the space resolution, two additional
random vectors are added to the initial electron
position:

$$\vec{z}_j^i = \vec{x}^i + \vec{y}_j^i + \vec{X}_{ExB}(\vec{x}^i + \vec{y}_j^i) + \vec{X}_{Unisochron}(\vec{x}^i + \vec{y}_j^i) \qquad (20)$$

The probability distributions of $\vec{X}_{ExB}$ and
$\vec{X}_{Unisochron}$ are functions of the random vectors $\vec{x}^i$
and $\vec{y}_j^i$, and they are strongly correlated. However,
simulation indicates that in large drift detectors the
distortions due to these effects are negligible compared
with the previous ones.

Combining the previous equations and neglecting the ExB
and unisochronity effects, the COG distortion
parametrization appears as:
$\sigma_z$ of the cluster center in the z (time) direction

$$\sigma_{z_{COG}}^2 = \frac{D_L^2 L_{Drift}}{N_{ch}}\,G_g + \frac{\tan^2\alpha\,L_{pad}^2\,G_{Lfactor}(N_{chprim})}{12\,N_{chprim}} + \sigma_{noise}^2 \qquad (21)$$

and $\sigma_y$ of the cluster center in the y (pad) direction

$$\sigma_{y_{COG}}^2 = \frac{D_T^2 L_{Drift}}{N_{ch}}\,G_g + \frac{\tan^2\beta\,L_{pad}^2\,G_{Lfactor}(N_{chprim})}{12\,N_{chprim}} + \sigma_{noise}^2 \qquad (22)$$

where $N_{ch}$ is the total number of electrons in the
cluster, $N_{chprim}$ is the number of primary electrons
in the cluster, $G_g$ is the gas gain fluctuation factor,
$G_{Lfactor}$ is the secondary ionization fluctuation factor
and $\sigma_{noise}$ describes the contribution of the electronic
noise to the resulting sigma of the COG.



4. Precision of cluster COG
determination using measured
amplitude

We have derived a parametrization using as parameters
the total number of electrons $N_{ch}$ and the number
of primary electrons $N_{chprim}$. This parametrization is
in good agreement with simulated data, where $N_{ch}$
and $N_{chprim}$ are known. It can be used as an estimate
for the limits of accuracy, if the mean values $\overline{N_{ch}}$ and
$\overline{N_{chprim}}$ are used instead.

$N_{ch}$ and $N_{chprim}$ are random variables described
by a Landau distribution and a Poisson distribution,
respectively.

In order to use the previously derived formulas (??,
??), the number of electrons can be estimated
assuming their proportionality to the total measured
charge A in the cluster. However, it turns out
that an empirical parametrization of the factors
G(N)/N = G(A)/(kA) gives better results. Formulas
(??) and (??) are transformed to the following form:

$\sigma_z$ of the cluster center in the z (time) direction:

$$\sigma_{z_{COG}}^2 = \frac{D_L^2 L_{Drift}}{A} \times \frac{G_g(A)}{k_{ch}} + \frac{\tan^2\alpha\,L_{pad}^2}{12A} \times \frac{G_{Lfactor}(A)}{k_{prim}} + \sigma_{noise}^2 \qquad (23)$$

and $\sigma_y$ of the cluster center in the y (pad) direction:

$$\sigma_{y_{COG}}^2 = \frac{D_T^2 L_{Drift}}{A} \times \frac{G_g(A)}{k_{ch}} + \frac{\tan^2\beta\,L_{pad}^2}{12A} \times \frac{G_{Lfactor}(A)}{k_{prim}} + \sigma_{noise}^2 \qquad (24)$$



5. Estimation of the precision of cluster
position determination using measured
cluster shape

The shape of the cluster is given by the convolution
of the responses to the electron avalanches. The time
response function and the pad response function are
almost gaussian, as well as the spread of electrons due
to the diffusion. The spread due to the angular effect
is uniform. Assuming that the contribution of the
angular spread does not dominate the cluster width,
the cluster shape is not far from gaussian. Therefore,
we can use the parametrization

$$f(t,p) = K_{Max} \cdot \exp\left(-\frac{(t - t_0)^2}{2\sigma_t^2} - \frac{(p - p_0)^2}{2\sigma_p^2}\right) \qquad (25)$$

where $K_{Max}$ is the normalization factor, t and p are
time and pad bins, $t_0$ and $p_0$ are the centers of the cluster
in the time and pad direction and $\sigma_t$ and $\sigma_p$ are the r.m.s.
of the time and pad cluster distribution.

The mean width of the cluster distribution is given
by:

$$\sigma_t = \sqrt{D_L^2 L_{drift} + \sigma_{preamp}^2 + \frac{\tan^2\alpha\,L_{pad}^2}{12}}, \qquad (26)$$

$$\sigma_p = \sqrt{D_T^2 L_{drift} + \sigma_{PRF}^2 + \frac{\tan^2\beta\,L_{pad}^2}{12}} \qquad (27)$$

where $\sigma_{preamp}$ and $\sigma_{PRF}$ are the r.m.s. of the time
response function and pad response function, respectively.

The fluctuation of the shape depends on the contribution
of the random diffusion and angular spread,
and on the contribution given by the gas gain fluctuation
and secondary ionization. The fluctuation of the
time and pad response functions is small compared
with the previous ones.

The measured r.m.s. of the cluster is influenced by
a threshold effect.

$$\mathrm{r.m.s.}_t^2 = \sum_{A(t,p) > \mathrm{threshold}} (t - t_0)^2 \times A(t,p) \qquad (28)$$

The threshold effect can be eliminated using a two-dimensional
gaussian fit instead of the simple COG
method. However, this approach is slow and, moreover,
the result is very sensitive to the gain fluctuation.

To eliminate the threshold effect in the r.m.s. method,
the bins below threshold are replaced with a virtual
charge using gaussian interpolation of the cluster
shape. The introduction of the virtual charge improves
the precision of the COG measurement. Large
systematic shifts in the estimate of the cluster position
(depending on the local track position relative to
pad-time) due to the threshold are no longer observed.

Measuring the r.m.s. of the cluster, the local diffusion
and angular spread of the electron cloud can
be estimated. This provides additional information
for the estimation of distortions. A simple additional
correction function is used:

$$\sigma_{COG} \to \sigma_{COG}(A) \times \left(1 + \mathrm{const} \times \frac{\delta RMS}{teorRMS}\right), \qquad (29)$$

where $\sigma_{COG}(A)$ is calculated according to formulas ??
and ??, and $\delta RMS/teorRMS$ is the relative distortion
of the signal shape from the expected one.


6. TPC cluster finder

The classical approach for the beginning of the
tracking was chosen. Before the tracking itself,
two-dimensional clusters in pad-row-time planes are
found. Then the positions of the corresponding space
points are reconstructed, which are interpreted as the
crossing points of the tracks and the centers of the
pad rows. We investigate a region of 5x5 bins in the
pad-row-time plane around the central bin with maximum
amplitude. The size of the region, 5x5 bins, is bigger than
the typical size of a cluster, as $\sigma_t$ and $\sigma_{pad}$ are about 0.75
bins.

The COG and r.m.s. are used to characterize a cluster.
The COG and r.m.s. are affected by systematic
distortions induced by the threshold effect. Depending
on the number of time bins and pads in the clusters,
the COG and r.m.s. are affected in different ways.
Unfortunately, the number of bins in a cluster is a
function of the local track position. To get rid of this
effect, two-dimensional gaussian fitting can be used.

Similar results can be achieved by the so-called r.m.s.
fitting using virtual charge. The signal below threshold
is replaced by the virtual charge, its expected
value according to an interpolation. If the virtual charge is
above the threshold value, then it is replaced with an
amplitude equal to the threshold value. The signal r.m.s.
is used for later error estimation and as a criterion for
cluster unfolding. This method gives results comparable
to a gaussian fit of the cluster but is much faster.
Moreover, the COG position is less sensitive to the
gain fluctuations.

Figure 2: Schematic view of unfolding principle.

Figure 3: Dependence of the position residual as function
of the distance to the second cluster.

The cluster shape depends on the track parameters.
The response function contribution and the diffusion
contribution to the cluster r.m.s. are known during clustering.
This is not true for the angular contribution to
the cluster width. The cluster finder should be optimised
for high momentum particles coming from the
primary vertex. Therefore, a conservative approach
was chosen, assuming the angle $\alpha$ to be zero. The tangent
of the angle $\beta$ is given by the z-position and the pad-row
radius, which is known during clustering.

6.1. Cluster unfolding

The estimated width of the cluster is used as the criterion
for cluster unfolding. If the r.m.s. in one of the
directions is greater than the critical r.m.s., the cluster is
considered for unfolding. The fast spline method is used
here. We require the charge to be conserved in this
method. Overlapped clusters are supposed to have the
same r.m.s., which is equivalent to the same track angles.
If this assumption is not fulfilled, the tracks diverge
very rapidly.

The unfolding algorithm has the following steps:

• Six amplitudes $C_i$ are investigated (see fig. ??).
The first (left) local maximum, corresponding to the
first cluster, is placed at position 3; the second (right)
local maximum, corresponding to the second cluster,
is at position 5.

• In the first iteration, the amplitude in bin 4 corresponding
to the cluster on the left side, $A_{L4}$, is calculated
using polynomial interpolation, assuming the
virtual amplitude at $A_{L5}$ and the derivative at $A_{L5}$
to be 0. Amplitudes $A_{L2}$ and $A_{L3}$ are considered
to be not influenced by the overlap ($A_{L2} = C_2$ and
$A_{L3} = C_3$).

• The amplitude $A_{R4}$ is calculated in a similar way.
In the next iteration the amplitude $A_{L4}$ is calculated
requiring charge conservation, $C_4 = A_{R4} + A_{L4}$.
Consequently

$$A_{L4} \to C_4\,\frac{A_{L4}}{A_{L4} + A_{R4}} \qquad (30)$$

and

$$A_{R4} \to C_4\,\frac{A_{R4}}{A_{L4} + A_{R4}} \qquad (31)$$

The two-cluster resolution depends on the distance between
the two tracks. Until the shape of the cluster triggers
unfolding, there is a systematic shift towards
the COG of the two tracks (see fig. ??), as only one cluster
is reconstructed. Afterwards, no systematic shift is
observed.
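
The charge-conserving step of the unfolding can be illustrated with a few lines of code. The sketch below only covers the rescaling of the two interpolated amplitudes in the shared bin; the polynomial interpolation of the first iteration is not reproduced, and the function name and the fallback for a vanishing sum are assumptions.

```python
def share_charge(c4, a_l4, a_r4):
    """Split the measured amplitude c4 of the shared bin between two clusters.

    a_l4 and a_r4 are the interpolated amplitudes of the left and right
    cluster in bin 4. The returned pair sums to c4 (charge conservation).
    Assumption: if both estimates vanish, the charge is split equally.
    """
    total = a_l4 + a_r4
    if total <= 0.0:
        return 0.5 * c4, 0.5 * c4
    return c4 * a_l4 / total, c4 * a_r4 / total
```

For instance, interpolated amplitudes of 10 and 5 in a bin with measured amplitude 30 are rescaled to 20 and 10, keeping their ratio while restoring the measured sum.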



6.2. Cluster characteristics 

The cluster is characterized by the COG in y and z 
directions (fY and fZ) and by the cluster width (fSig- 
maY, fSigmaZ). The deposited charge is described by 
the signal at maximum (fMax), and total charge in 
cluster (fQ). The cluster type is characterized by the 
data member fCType which is defined as a ratio of the 
charge supposed to be deposited by the track and to- 
tal charge in cluster in investigated region 5x5. The 
error of the cluster position is assigned to the clus- 
ter only during tracking according formulas (??) and 
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(??), when track angles a and ft are known with suf- 
ficient precision. 

Obviously, measuring the position of each electron 
separately the effect of the gas gain fluctuation can 
be removed, however this is not easy to implement 
in the large TPC detectors. Additional information 
about cluster asymmetry can be used, but the result- 
ing improvement of around 5% in precision on sim- 
ulated data is negligible, and it is questionable, how 
successful will be such correction for the cluster asym- 
metry on real data. 

However, cluster asymmetry can be used as an additional criterion for cluster unfolding. Let us denote by $\mu_i$ the $i$-th central moment of a cluster which was created by the overlap of two sub-clusters with unknown positions and deposited energies (with moments ${}^1\mu_i$ and ${}^2\mu_i$).

Let $r_1$ be the ratio of the two cluster amplitudes:

$$r_1 = \frac{{}^1\mu_0}{{}^1\mu_0 + {}^2\mu_0},$$

and let the track distance $d$ be equal to

$$d = {}^1\mu_1 - {}^2\mu_1.$$

Assuming that the second moments of both sub-clusters are the same (${}^0\mu_2 = {}^1\mu_2 = {}^2\mu_2$), the distance $d$ between the two sub-clusters and the amplitude ratio $r_1$ can be estimated:

$$R = \frac{\mu_3^6}{(\mu_2^2 - {}^0\mu_2^2)^3}, \tag{32}$$

$$r_1 = 0.5 \pm 0.5 \times \sqrt{\frac{1}{1 + 4/R}}, \tag{33}$$

$$d = \sqrt{(4 + R) \times (\mu_2^2 - {}^0\mu_2^2)}. \tag{34}$$

Equations (33) and (34) are linked by $r_1 (1 - r_1) = 1/(4 + R)$.

In order to trigger unfolding using the shape information, additional information about the track and the mean cluster shape over several pad-rows is needed. This information is available only during the tracking procedure.

6.3. TPC seed finding

The first and most time-consuming step in tracking is seed finding. Two different seeding strategies are used: combinatorial seeding with a vertex constraint and a simple track follower.

6.4. Combinatorial seeding algorithm

Combinatorial seeding starts with a search for all pairs of points in pad-row number $i_1$ and in pad-row $i_2$, $n$ rows closer to the interaction point ($n = i_1 - i_2 = 20$ at present), which can project to the primary vertex. The position of the primary vertex is reconstructed, with high precision, from hits in the ITS pixel layers, independently of the track determination in the TPC.

Figure 4: Schematic view of the combinatorial seeding procedure.

The combinatorial seeding algorithm consists of the following steps:

• Loop over all clusters on pad-row $i_1$.

  - Loop over all clusters on pad-row $i_2$ inside a given window. The size of the window is defined by a cut on the track curvature (C), chosen so that primary tracks with $p_t$ above a threshold are seeded.

    * When a reasonable pair of clusters is found, the parameters of a helix going through these points and the primary vertex are calculated. The parameters of this helix are taken as an initial approximation of the parameters of the potential track. The corresponding covariance matrix is evaluated using the point errors, which are given by the cluster finder, and applying an uncertainty of the primary vertex position. This is the only place where a certain (not too strong) vertex constraint is introduced. Later on, tracks are allowed to have any impact parameter at the primary vertex, both in the z-direction and in the $r$-$\varphi$ plane.

    * Using the calculated helix parameters and their covariance matrix, the Kalman filter is started from the outer point of the pair to the inner one.

    * If at least half of the potential points between the initial ones were successfully associated with the track candidate, the track is saved as a seed.

  - End of loop over pad-row $i_2$.

• End of loop over pad-row $i_1$.
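
The pair search in the loop above can be sketched in Python. This is only a simplified illustration, not the ALICE implementation: clusters are reduced to (x, y) positions in the transverse plane, the primary vertex is assumed to sit at the origin, the search window is a fixed half-width in y (the hypothetical parameter `window`) instead of being derived from the curvature cut, and the helix fit and Kalman propagation between the two rows are omitted. All function and parameter names are assumptions made for this example.

```python
import bisect
import math

def curvature_through_vertex(a, b):
    """Curvature (1/radius) of the circle through the origin and points a, b."""
    cross = a[0] * b[1] - a[1] * b[0]
    norm = math.hypot(*a) * math.hypot(*b) * math.hypot(a[0] - b[0], a[1] - b[1])
    return 2.0 * abs(cross) / norm

def find_seed_pairs(row1, row2, window, max_curvature):
    """Pair clusters of the outer row (row1) with clusters of the inner row (row2).

    Each row is a list of (x, y) positions sorted by y; all clusters of one
    row share the same x. Returns the list of accepted (c1, c2) pairs.
    """
    if not row2:
        return []
    x2 = row2[0][0]
    ys2 = [c[1] for c in row2]
    pairs = []
    for c1 in row1:                               # loop over pad-row i1
        y_centre = c1[1] * x2 / c1[0]             # straight line towards the vertex
        lo = bisect.bisect_left(ys2, y_centre - window)
        hi = bisect.bisect_right(ys2, y_centre + window)
        for c2 in row2[lo:hi]:                    # loop over the window on pad-row i2
            if curvature_through_vertex(c1, c2) <= max_curvature:
                pairs.append((c1, c2))            # candidate for the helix + Kalman step
    return pairs
```

Restricting the inner loop to a window found by binary search is what keeps the cost below the full product of the two cluster counts; the curvature test then plays the role of the $p_t$ threshold.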

6.5. Track following seeding algorithm

Seeding between two pad-rows, $i_1$ and $i_2$, starts in the middle pad-row. For each cluster in the middle pad-row, the two nearest clusters in the pad-rows above and below are found. Afterwards, a linear fit in both directions (z and y) is calculated, and the expected prolongations to the next two pad-rows are computed. For each prolongation, the two nearest clusters are found again. The algorithm continues recursively up to pad-rows $i_1$ and $i_2$. The linear fit is replaced by a polynomial fit after 7 clusters. If more than half of the potential clusters are found, the track parameters and covariance are calculated as before.
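
A minimal Python sketch of the track-following idea is given below. It is a simplification under stated assumptions: only one coordinate (y) is followed, pad-rows are identified by their integer index, the switch to a polynomial fit after 7 clusters is left out, and the acceptance distance `tolerance` is a hypothetical parameter. The function names are illustrative and do not come from the ALICE code.

```python
def fit_line(points):
    """Least-squares line y = a + b * x through a list of (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

def nearest(values, target):
    """Value closest to target, or None for an empty pad-row."""
    return min(values, key=lambda v: abs(v - target)) if values else None

def follow_from_middle(rows, y_start, tolerance):
    """Follow a track outwards from the middle pad-row in both directions.

    rows[i] holds the y positions of the clusters on pad-row i.
    Returns the attached (row, y) points, or None if too few were found.
    """
    mid = len(rows) // 2
    points = [(mid, y_start)]
    lo, hi = mid - 1, mid + 1
    while lo >= 0 or hi < len(rows):
        for row in (lo, hi):
            if not 0 <= row < len(rows):
                continue
            if len(points) >= 2:
                a, b = fit_line(points)       # prolongation from the current fit
                y_exp = a + b * row
            else:
                y_exp = y_start               # no direction known yet
            y = nearest(rows[row], y_exp)
            if y is not None and abs(y - y_exp) <= tolerance:
                points.append((row, y))
        lo -= 1
        hi += 1
    # Keep the seed only if more than half of the potential clusters were found.
    return points if len(points) > len(rows) / 2 else None
```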

6.6. Seed finding strategy 



Table I: Combinatorial seeding efficiency and time consumption as a function of the distance between the two pad-rows.

distance [pad-rows]    time [s]    efficiency [%]
24                     95          92.2
20                     52          90.4
16                     34          88.7
14                     25          88.1
12                     19          85.2



The main advantage of combinatorial seeding is its high efficiency, around 90% for primaries with $p_t > 200$ MeV/c. The main disadvantage is the $N^2$ problem of the combinatorial search. The $N^2$ problem can be reduced by restricting the size of the seeding window. This can be achieved by making the distance between the seeding pad-rows smaller, as the size of the window is proportional to $i_1 - i_2$. However, as the seeding distance decreases, both the efficiency of the seeding and the quality of the seeds deteriorate. The size of the window can also be reduced by reducing the threshold curvature of the track candidate.

However, the vertex constraint suppresses secondaries, which should also be found. The track following seeding has to be used for them. This strategy is much faster but less efficient (80%). The efficiency is decreased mainly by track overlaps and, for low-$p_t$ tracks, by the angular effect, which correlates the cluster position distortions between neighbouring pad-rows.

The efficiency of seeding can be increased by repeating the seeding procedure in different layers of the TPC. Assuming that overlapped tracks are a random background for the track which should be seeded, the total efficiency of the seeding can be expressed as

$$\epsilon_{\mathrm{all}} = 1 - \prod_i (1 - \epsilon_i),$$

where $\epsilon_i$ is the efficiency of one seeding. By repeating the seeding, the efficiency should approach 100%. Unfortunately, tracks are sometimes very close over a long path, and seedings in different layers cannot be considered independent. The efficiency of seeding therefore saturates at a value smaller than 1. Another problem with repetitive seeding is that the occupancy increases towards lower pad-row radii, and thus the efficiency is a function of the pad-row radius.
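
The idealized case of independent seedings, in which the total efficiency is one minus the product of the individual miss probabilities, can be evaluated with a few lines of Python. The function name is an assumption made for this example, and the independence assumption is exactly the one that the text says does not fully hold in practice.

```python
def combined_efficiency(efficiencies):
    """Total efficiency 1 - prod(1 - eps_i), assuming independent seedings."""
    miss = 1.0
    for eps in efficiencies:
        miss *= 1.0 - eps      # probability that every seeding so far missed the track
    return 1.0 - miss
```

For example, three independent seedings of 80% efficiency each would give 1 - 0.2^3 = 99.2%; correlated losses along a long overlap make the real value lower.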

However, in order to find secondaries from kinks or V0 decays, it is necessary to perform highly efficient seeding in the outermost pad-rows. On the other hand, in the case of kinks in the high-density environment, it is almost impossible to start the tracking of the primary particles using only the last point of the secondary track, because this point is not well defined. In order to find them, seeding in the innermost pad-rows should be performed. In both seeding strategies, a large decrease of efficiency and precision due to the dead zones is observed. Additional seeding at the sector edges is necessary. The length of the pads in the outermost 30 pad-rows is greater than in the other pad-rows. The minimum occupancy and the maximum seeding efficiency are obtained when the outer pad-rows are used. In order to maximize the tracking efficiency for secondaries, it is necessary to make almost continuous seeding inside the TPC. Several combinations of the slow combinatorial seeding and the fast track following seeding were investigated. Depending on the required efficiency, different amounts of time can be spent on seeding. The default seeding for the tracking performance results was chosen as follows: two combinatorial seedings in the outermost 20 pad-rows, and six track following seedings homogeneously spaced inside the outermost sector.

More sophisticated and faster seeding is currently under development. It is planned to use for seeding only the clusters which were not assigned to tracks classified as almost perfect. The criteria for an almost perfect track have to be defined, depending on the track density.

7. Parallel Kalman tracking 

After seeding, several track hypotheses are tracked in parallel. The following algorithm is used:

• For each track candidate, find the prolongation to the next pad-row.

• Find the nearest cluster.

• Estimate the cluster position distortions according to the track and cluster parameters.

• Update the track according to the current cluster parameters and errors.

• Remove overlapped track hypotheses, i.e. those which share too many clusters.

• Stop inactive hypotheses.

• Continue down to the last pad-row.

The prolongation to the next pad-row is calculated according to the current track hypothesis. The uncertainties of the local track position, $\sigma_y$ and $\sigma_z$, are calculated from the covariance matrix. For each track prolongation a window is calculated. The width of the window is set to $\pm 4\sigma$, where $\sigma$ is given by the convolution of the predicted track error and the predicted expectation for the cluster r.m.s. Clusters in the container are ordered according to their coordinates, and a binary search with O(log n) performance is used. The nearest cluster is taken as the most probable one. No cluster competition is currently implemented because of the memory required when branching the Kalman track hypotheses and because of the performance penalty.
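
The window search can be illustrated with a short Python sketch. It assumes a single coordinate, Gaussian errors (so that the convolution of the track error and the cluster r.m.s. reduces to a sum in quadrature), and a plain sorted list as the cluster container; the function and parameter names are hypothetical.

```python
import bisect
import math

def nearest_cluster_in_window(sorted_y, y_pred, sigma_track, sigma_cluster, n_sigma=4.0):
    """Index of the cluster nearest to y_pred inside the +-n_sigma window, or None."""
    sigma = math.hypot(sigma_track, sigma_cluster)   # Gaussian errors add in quadrature
    lo = bisect.bisect_left(sorted_y, y_pred - n_sigma * sigma)
    hi = bisect.bisect_right(sorted_y, y_pred + n_sigma * sigma)
    if lo == hi:
        return None                                  # window is empty
    # The nearest cluster is one of the two neighbours of the insertion point.
    pos = bisect.bisect_left(sorted_y, y_pred, lo, hi)
    candidates = [i for i in (pos - 1, pos) if lo <= i < hi]
    return min(candidates, key=lambda i: abs(sorted_y[i] - y_pred))
```

Because the container is sorted, both the window limits and the nearest neighbour are found with binary searches, so the cost per prolongation grows only logarithmically with the number of clusters on the pad-row.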

The width of the search window was chosen to also take overlapped clusters into account. The position error in this case can be significantly larger than the estimated error for a non-overlapped cluster, and the overlap factor is not known a priori. On the other hand, the minimal distance between two reconstructed clusters is restricted by the local-maximum requirement. Two clusters separated by less than ~2 bins (~1 cm) cannot be observed.

Once the nearest cluster is found, the cluster error is estimated using the cluster position and the amplitude according to formulas (??) and (??). The correction for the cluster shape and overlap factor is calculated according to formula (??).

The cluster is finally accepted if the square of the residual in each direction is smaller than the estimated $(3\sigma)^2$. If this is the case, the track parameters are updated according to the cluster position and the error estimates.

It may occur that the track leaves one TPC sector and enters another. In this case the track parameters and the covariance matrix are recalculated so that they are always expressed in the local coordinate system of the sector in which the track currently is. The variable fNFindable is defined as the number of potentially findable clusters. If the track is locally inside the sensitive volume, fNFindable is incremented; otherwise it remains unchanged.

If no clusters are found in several pad-rows in an active region of the TPC, the track hypothesis should be removed. The cluster density is defined as the ratio of the number of accepted clusters to the number of all findable clusters in a region of several pad-rows.

It is not known a priori whether a given track is primary or secondary; therefore the local density cannot be interpreted definitively as the real density. This would be true only for tracks which really go through all the considered pad-rows. Tracks with low local density are not completely removed; they are only flagged (fRemoval variable) for the next analysis.
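
The bookkeeping described above can be sketched as follows. The region length (`region`) and the density cut (`min_density`) are hypothetical values chosen only for illustration, and the per-row boolean lists stand in for the counters kept by the tracker; only the idea of the fNFindable-style normalization is taken from the text.

```python
def local_cluster_density(findable, accepted):
    """Ratio of accepted to findable clusters over a region of pad-rows.

    findable[i] is True if the track is inside the sensitive volume on row i,
    accepted[i] is True if a cluster was attached on row i.
    Returns None if no row of the region is findable.
    """
    n_findable = sum(1 for f in findable if f)
    n_accepted = sum(1 for f, a in zip(findable, accepted) if f and a)
    return n_accepted / n_findable if n_findable else None

def flag_for_removal(findable, accepted, region=10, min_density=0.5):
    """True if the density over the last `region` pad-rows is below the cut."""
    density = local_cluster_density(findable[-region:], accepted[-region:])
    return density is not None and density < min_density
```

Rows in dead zones do not enter the denominator, so a track crossing a sector boundary is not penalized for clusters it could not have produced.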

In order to be able to remove track hypotheses which are almost the same, the so-called overlap factor is defined. It is the ratio of the number of clusters shared between two track candidates to the number of all clusters. If the overlap factor is greater than a threshold, the track candidate with the higher $\chi^2$ or a significantly lower number of points is removed. The threshold is a parameter; in the performance studies the value 0.6 is currently used. This is a compromise between the requirement of maximal efficiency and the requirement of a minimal number of double-found tracks. In the future this parameter will be optimized to increase the double-track resolution. In this case new criteria to remove double-found tracks will have to be used.
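
A possible reading of this selection is sketched below in Python. Two points are assumptions, since the text does not specify them: the overlap factor is normalized to the candidate with fewer clusters, and "significantly lower number of points" is expressed through the hypothetical parameter `point_margin`. Tracks are represented as plain dictionaries purely for illustration.

```python
def overlap_factor(clusters_a, clusters_b):
    """Shared clusters divided by the cluster count of the shorter candidate."""
    a, b = set(clusters_a), set(clusters_b)
    n_min = min(len(a), len(b))
    return len(a & b) / n_min if n_min else 0.0

def select_to_remove(track_a, track_b, threshold=0.6, point_margin=0.8):
    """Return the candidate to drop, or None if the two do not overlap enough.

    Each track is a dict with keys 'clusters' (cluster ids) and 'chi2'.
    """
    if overlap_factor(track_a["clusters"], track_b["clusters"]) <= threshold:
        return None
    n_a, n_b = len(track_a["clusters"]), len(track_b["clusters"])
    if n_a < point_margin * n_b:        # a has significantly fewer points
        return track_a
    if n_b < point_margin * n_a:        # b has significantly fewer points
        return track_b
    return track_a if track_a["chi2"] > track_b["chi2"] else track_b
```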

7.1. Double track resolution 

The large track density represents the main challenge for tracking in the ALICE TPC. Below a certain distance between two tracks, the clusters are no longer resolved. In our algorithm, track candidates are removed if some fraction of the clusters is common to two track candidates. There are three possibilities if two tracks overlap over a very long path: either it is the same track, or there are two very close tracks, or there are two tracks where one changed direction towards the second one and the change of direction was misinterpreted as multiple scattering.

New criteria should be defined to handle this situation. The cluster shape can again be used for this purpose. If two tracks overlap and their separation is too small, only one cluster is reconstructed; however, its width is systematically greater. Moreover, the charge deposited in the cluster is also systematically higher.

Another problem is double-found clusters, mainly in the low-$p_t$ region. There are two reasons:

• The non-Gaussian tail of Coulomb scattering can change the direction of the track; the track can then be lost and found again during the next seeding.

• Because of large inclination and Landau fluctuations, clusters with double local maxima can be created.

In order to maximize the double-track resolution and to minimize the number of double-found tracks, new criteria (mean local deposited charge and mean local cluster shape) are under investigation.

7.2. dE/dx measurement 

To estimate the particle mean ionization energy loss dE/dx, a logarithmic truncated mean is used. With the current cluster finder, truncation at 60% gives the best dE/dx resolution. Currently the amplitudes at the local cluster maxima are used instead of the total cluster charge, in order to avoid the distortion due to track overlaps. Shared clusters are not used for the dE/dx estimate at all.

Table II: TPC tracking performance (dN/dy = 4000 charged primaries).



The measured amplitude is normalized to the track length, given by the angles $\alpha$ and $\beta$ and by the pad length. Specific normalization factors are used for each pad type, as the electronic parameters (gas gain, pad response function) are different in different parts of the TPC. The normalization condition requires the same dE/dx in each part of the TPC for one track.

A correlation between the measured dE/dx and the particle multiplicity was observed. An additional correction function for the cluster shape was successfully introduced to take local cluster overlaps into account.
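
One way to read "logarithmic truncated mean" is sketched below in Python: the amplitudes are sorted, the lowest 60% are kept, and the mean of their logarithms is exponentiated. This interpretation is an assumption, as the text does not spell out the definition. The function name is hypothetical, and the input is assumed to contain only positive, already normalized amplitudes of clusters that are not shared.

```python
import math

def log_truncated_mean(amplitudes, fraction=0.6):
    """exp(mean(log(a))) over the lowest `fraction` of the sorted amplitudes."""
    if not amplitudes:
        raise ValueError("at least one amplitude is required")
    ordered = sorted(amplitudes)
    n_keep = max(1, round(len(ordered) * fraction))
    kept = ordered[:n_keep]                  # drop the high (Landau) tail
    return math.exp(sum(math.log(a) for a in kept) / n_keep)
```

Discarding the largest amplitudes suppresses the Landau tail, and averaging logarithms further reduces the weight of the remaining high values.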



ing task in this experiment. The track finding efficiency has increased, compared to the previous attempts, by about 10% for primary tracks, and even more for secondary tracks. The main improvement is a consequence of the sophisticated cluster finding and deconvolution, which is based on a detailed understanding of the physical processes in the TPC and the optimal usage of the achievable information. Another factor which helped to increase the efficiency, especially for secondary tracks, is the new seeding procedure. The ALICE TPC tracker fulfils, and even exceeds, the basic requirements. Further development will concentrate on secondary vertexing inside the TPC and the possible use of information from other detectors.